---
title: "Python for Data Handling and Automation"
description: "From a JavaScript developer's perspective, learn the powerful features of Python for file handling, regular expressions, and data analysis (Pandas)."
---

## 1. Introduction

### Python: The Swiss Army Knife for Data Processing

While JavaScript (especially Node.js) is capable of handling files and executing scripts, Python has a more powerful and mature ecosystem for data processing, scientific computing, and automation tasks. For many developers, this is one of the main reasons to learn Python.

This module will show you how Python easily handles various data processing challenges, from simple file I/O to complex data analysis.

**Core Concept Analogy**

| Python | JavaScript | Main Function |
| --- | --- | --- |
| **Built-in file operations** | `fs` module | File reading and writing |
| **`pathlib`** | `path` module | Cross-platform path operations |
| **`re` module** | `RegExp` object | Regular expressions |
| **`pandas`** | `lodash`, `danfo.js` | High-performance data analysis |
| **`csv` module** | `csv-parser`, `papaparse` | CSV file reading and writing |

> 💡 **Learning Strategy**: Think of these Python tools as a "super-upgrade" to your existing JavaScript/Node.js workflow. You'll find Python to be more adept at handling structured data and large datasets.

## 2. File Operations

Python provides a very concise and intuitive syntax for file operations.

### 2.1 Reading and Writing Files

<PythonEditor title="File Read/Write Comparison" compare={true}>
```javascript !! js
// JavaScript (Node.js)
const fs = require('fs');

// Write file
fs.writeFileSync('hello.txt', 'Hello from Node.js!');

// Read file
const content = fs.readFileSync('hello.txt', 'utf8');
console.log(content);

// Async read/write
fs.writeFile('hello_async.txt', 'Async hello!', (err) => {
  if (err) throw err;
  fs.readFile('hello_async.txt', 'utf8', (err, data) => {
    if (err) throw err;
    console.log(data);
  });
});
```

```python !! py
# Python
# Write file
with open("hello.txt", "w", encoding="utf-8") as f:
    f.write("Hello from Python!")

# Read file
with open("hello.txt", "r", encoding="utf-8") as f:
    content = f.read()
    print(content)

# pathlib (the more modern way)
from pathlib import Path

Path("hello_pathlib.txt").write_text("Hello from Pathlib!", encoding="utf-8")
content_pathlib = Path("hello_pathlib.txt").read_text(encoding="utf-8")
print(content_pathlib)
```
</PythonEditor>

### 2.2 Handling JSON Files

<PythonEditor title="JSON File Handling Comparison" compare={true}>
```javascript !! js
// JavaScript
const data = { name: "Alice", age: 30 };

// Write to JSON file
fs.writeFileSync('data.json', JSON.stringify(data, null, 2));

// Read from JSON file
const rawData = fs.readFileSync('data.json');
const parsedData = JSON.parse(rawData);
console.log(parsedData.name); // Alice
```

```python !! py
# Python
import json

data = {"name": "Alice", "age": 30}

# Write to JSON file
with open("data.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2)

# Read from JSON file
with open("data.json", "r", encoding="utf-8") as f:
    parsed_data = json.load(f)
    print(parsed_data["name"])  # Alice
```
</PythonEditor>

## 3. Regular Expressions

Python's `re` module provides powerful regular expression capabilities.

<PythonEditor title="Regular Expression Comparison" compare={true}>
```javascript !! js
// JavaScript
const text = "My email is test@example.com";
const regex = /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b/;
const match = text.match(regex);
if (match) {
  console.log(match[0]); // test@example.com
}

// Replace
const newText = text.replace(/email/g, 'address');
console.log(newText); // My address is test@example.com
```

```python !! py
# Python
import re

text = "My email is test@example.com"
regex = r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b"
match = re.search(regex, text);
if match:
    print(match.group(0))  # test@example.com

# Replace
new_text = re.sub(r"email", "address", text)
print(new_text)  # My address is test@example.com
```
</PythonEditor>

## 4. Pandas: The Powerhouse of Python Data Analysis

`pandas` is the core library of the Python data science ecosystem. It provides two powerful data structures, `DataFrame` and `Series`, which make data cleaning, transformation, analysis, and visualization incredibly simple.

### 4.1 Series: One-Dimensional Data

A `Series` is like a labeled array.

<PythonEditor title="Pandas Series Basics" compare={true}>
```javascript !! js
// JavaScript (Array-like)
const data = [10, 20, 30, 40];
data.forEach(d => console.log(d));

const filteredData = data.filter(d => d > 20);
console.log(filteredData); // [30, 40]
```

```python !! py
# Python (Pandas Series)
import pandas as pd

data = pd.Series([10, 20, 30, 40])
print(data)

filtered_data = data[data > 20]
print(filtered_data)
```
</PythonEditor>

### 4.2 DataFrame: Two-Dimensional Data

The `DataFrame` is the core of `pandas`. It is a two-dimensional tabular data structure, which can be thought of as a container for `Series`.

<PythonEditor title="Pandas DataFrame Basics" compare={true}>
```javascript !! js
// JavaScript (Array of objects)
const users = [
  { name: 'Alice', age: 30, city: 'New York' },
  { name: 'Bob', age: 25, city: 'Paris' },
  { name: 'Charlie', age: 35, city: 'London' },
];

// Filtering
const youngUsers = users.filter(u => u.age < 30);
console.log(youngUsers);

// Adding a new column
users.forEach(u => u.is_senior = u.age > 30);
console.log(users);
```

```python !! py
# Python (Pandas DataFrame)
import pandas as pd

data = {
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [30, 25, 35],
    'city': ['New York', 'Paris', 'London']
}
df = pd.DataFrame(data)

# Filtering
young_users = df[df['age'] < 30]
print(young_users)

# Adding a new column
df['is_senior'] = df['age'] > 30
print(df)
```
</PythonEditor>

### 4.3 Reading Data from CSV

`pandas` can very conveniently read data from various data sources (CSV, Excel, SQL, etc.).

<PythonEditor title="Reading from CSV Comparison" compare={true}>
```javascript !! js
// JavaScript (Node.js with csv-parser)
const fs = require('fs');
const csv = require('csv-parser');

const results = [];

fs.createReadStream('data.csv')
  .pipe(csv())
  .on('data', (data) => results.push(data))
  .on('end', () => {
    console.log(results);
  });
```

```python !! py
# Python (Pandas)
import pandas as pd

# Assume there is a data.csv file
# name,age,city
# Alice,30,New York
# Bob,25,Paris

df = pd.read_csv("data.csv")
print(df)

# For very large files, you can read in chunks to save memory
# chunk_iter = pd.read_csv("large_data.csv", chunksize=1000)
# for chunk in chunk_iter:
#     process(chunk)

# Example data analysis
print("\nAverage age:", df["age"].mean())
print("\nUsers from New York:")
print(df[df["city"] == "New York"])
```
</PythonEditor>

## 5. Summary

Python's capabilities in data processing and automation go far beyond this, but this module has laid a solid foundation for you. You now know how to:

*   Perform basic file operations using Python.
*   Use the `re` module for powerful text matching and processing.
*   Use the `pandas` library for efficient data analysis.

These skills will open a new door for you, allowing you to use Python to solve a variety of complex data problems and write powerful automation scripts.
